Inductive Lexica

Authors

  • Walter Daelemans
  • Gert Durieux

Abstract

Machine Learning techniques are useful tools for the automatic extension of existing lexical databases. In this paper, we review some symbolic machine learning methods which can be used to add new lexical material to the lexicon by automatically inducing the regularities implicit in lexical representations already present. We introduce the general methodology for the construction of inductive lexica, and discuss empirical results on extending lexica with two types of information: pronunciation and gender.
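
To make the idea of inducing regularities from existing entries concrete, the following is a minimal sketch of gender prediction by similarity to stored lexicon entries. It is not the method of the paper: the nearest-neighbour approach, the choice of German nouns, the word-final letter features, and all names (`LEXICON`, `suffix_features`, `overlap`, `predict_gender`) are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical toy lexicon of (noun, gender) pairs, chosen for illustration.
# It is not data from the paper.
LEXICON = [
    ("zeitung", "f"),
    ("meinung", "f"),
    ("wohnung", "f"),
    ("mädchen", "n"),
    ("kätzchen", "n"),
    ("lehrer", "m"),
    ("fahrer", "m"),
]


def suffix_features(word, n=3):
    """Return the last n characters as a tuple, left-padded with '_'."""
    return tuple(word[-n:].rjust(n, "_"))


def overlap(a, b):
    """Count the positions at which two feature tuples agree."""
    return sum(x == y for x, y in zip(a, b))


def predict_gender(word, lexicon=LEXICON, k=3):
    """Majority vote over the k stored entries whose endings are most similar."""
    target = suffix_features(word)
    ranked = sorted(
        lexicon,
        key=lambda entry: overlap(target, suffix_features(entry[0])),
        reverse=True,
    )
    votes = Counter(gender for _, gender in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Words that are not in the toy lexicon.
    for new_word in ["rechnung", "häuschen", "bäcker"]:
        print(new_word, predict_gender(new_word))
```

The sketch assigns a gender to an unseen word by analogy with stored entries, so no explicit rules are written by hand. A realistic system would need a much larger lexicon and a better-motivated feature set.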

Similar resources

Transforming Lexica as Trees

We investigate the problem of structurally changing lexica, while preserving the information. We present a type of lexicon transformation that is complete on an interesting class of lexica. Our work is motivated by the problem of merging one or more lexica into one lexicon. Lexica, lexicon schemas, and lexicon transformations are all seen as particular kinds of trees.

Large lexica for speech-to-speech translation: from specification to creation

This paper presents the corpora collection and lexica creation for the purposes of Automatic Speech Recognition (ASR) and Text-to-speech (TTS) that are needed in speech-to-speech translation (SST). These lexica will be specified, built and validated within the scope of the EU-project LC-STAR (Lexica and Corpora for Speech-to-Speech Translation Components) during the years 2002-2005. Large lexic...

xLiD-Lexica: Cross-lingual Linked Data Lexica

In this paper, we introduce our cross-lingual linked data lexica, called xLiD-Lexica, which are constructed by exploiting the multilingual Wikipedia and linked data resources from Linked Open Data (LOD). We provide the cross-lingual groundings of linked data resources from LOD as RDF data, which can be easily integrated into the LOD data sources. In addition, we build a SPARQL endpoint over our...

Lexicon and Corpora for Speech to Speech Translation (LC-STAR)

The objective of the EU-project LC-STAR (Lexica and Corpora for Speech-to-Speech Translation Components) is corpora collection and lexica creation for the purposes of Automatic Speech Recognition (ASR) and Text-to-speech (TTS) that are needed in speech-to-speech translation (SST). During the lifetime of the project (2002-2005) these lexica will be specified, built and validated. Large lexica co...

Creation and Validation of Large Lexica for Speech-to-Speech Translation Purposes

This paper presents specifications and requirements for creation and validation of large lexica that are needed in Automatic Speech Recognition (ASR), Text-to-Speech (TTS) and statistical Speech-to-Speech Translation (SST) systems. The prepared language resources are created and validated within the scope of the EU-project LC-STAR (Lexica and Corpora for Speech-to-Speech Translation Component...

Publication date: 2000